58 PART 2 Examining Tools and Processes
Using these online commercial platforms has both advantages and disadvantages.
On the plus side, online software tends to follow a lower-cost subscription
model, paid monthly or annually, and you get continuous upgrades because the
software is web based. The main downside is that these platforms have a steep
learning curve and take a lot of work to fully adopt, so you have to ask yourself
whether one makes sense for your project.
Focusing on Open-Source and Free Software
Open-source software is software whose source code is publicly available and
that is typically developed and supported by a user community. Open-source
software still comes with a license: The software is typically free, but the
license requires you to adhere to certain policies when using it. In this
section, we talk about the two most popular open-source tools for statistical
work: R and Python.
Open-source software
The two most popular and extensive open-source statistical programs are R and
Python:
» R: R is statistical software that has been developed and is maintained by the R
user community. It has two commonly used interfaces: R GUI, which looks
similar to PC SAS and SPSS, and RStudio, which is an integrated development
environment (IDE). Analysts tend to prefer RStudio when developing graphical
displays for the web, while R GUI is fine for most statistical work. To run R,
you download and install the base application. Then, for specialized functions
not included in the base application, you install additional R packages. As
with PC SAS, in R you import or connect to datasets, develop and save code
files to run on those datasets, and produce output you can save. Base R,
R packages, and documentation are available on the Comprehensive R Archive
Network (CRAN) server at https://cran.r-project.org.
» Python: Python is an open-source programming language that is often used
to analyze data. As with R, Python is developed and maintained by its own
user community, and you work with it in a similar way: You develop code
that runs against datasets in the Python environment. Python and R are
separate languages, though, so code written for one doesn't run in the other.
Where R has packages, Python has libraries. Python is available at
www.python.org/downloads.
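
To make the workflow described in this list more concrete (bring in a dataset, run code against it, and produce output you can save), here is a minimal Python sketch. It is an illustration only, not a recommended analysis: The dataset is made up and typed directly into the code so the example runs without an external file, and the function names (load_rows, summarize, write_summary) are hypothetical names chosen for this example. It uses only libraries that ship with Python; in practice, many analysts install third-party libraries such as pandas for this kind of work.

```python
import csv
import io
import statistics

# Hypothetical dataset, inlined as CSV text so the sketch is self-contained.
RAW_CSV = """patient_id,age,systolic_bp
1,34,118
2,51,131
3,47,126
4,62,140
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def summarize(rows, column):
    """Return the count, mean, and sample standard deviation of one numeric column."""
    values = [float(row[column]) for row in rows]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


def write_summary(summary, column):
    """Format the summary as CSV text that could be written to an output file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["column", "n", "mean", "sd"])
    writer.writerow(
        [column, summary["n"], round(summary["mean"], 2), round(summary["sd"], 2)]
    )
    return buffer.getvalue()


rows = load_rows(RAW_CSV)
bp_summary = summarize(rows, "systolic_bp")
print(write_summary(bp_summary, "systolic_bp"))
```

To work with a real file instead, you would read its contents (for example, with Python's built-in open function) and pass the text to load_rows in place of RAW_CSV.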